Pollinator Indicators

Authors
Affiliation

Kwaku Peprah Adjei

Norwegian institute for Nature Research

Anders Kolstad

Norwegian institute for Nature Research

Markus A.K. Sydenham

Norwegian institute for Nature Research

Published

August 19, 2024

Status

This indicator documentation is incomplete and needs further developement before indicator values can be calculated.

Recomended citation

Adjei, K. P, Kolstad, A. , Sydenham, M. Pollinator Indicators Pollinator Indicators v. 000.001 ecRxiv https://github.com/NINAnor/ecRxiv

Version

000.001

Table 1: Indicator metadata
Indicator ID NA
Indicator Name Pollinator Indicators
Country Norway
Continent Europe
Ecosystem Condition Typology Class NA
Realm Terrestrial (T)
Biome NA
Ecosystem NA
Year added 2024
Last update 2024
status incomplete
Version 000.001
Version comment First draft
url https://github.com/NINAnor/ecRxiv

Pollinator potential indicators





1. Introduction

We aim to model the diversity of pollinators in Norway given their know interaction with some plant species. From the estimated diversity across the nation, we intend to look at the estimates in the ASO Data meadows by using their values as the reference values. To achieve this, we do the following:

    1. obtain data from the Global Biodiversity Infrastructure Facility (GBIF ) for bumblebees, butterflies and bees.
    1. fit an integrated species distribution model (ISDM) to the data to obtain the distribution of pollinators
    1. obtain a web plot interaction matrix of the pollinator and plant species
    1. get data on the plant species within the interaction matrix from GBIF
    1. fit an ISDM data obtained in step (d) to estimate their respective species occurrence probability
    1. estimate the alpha diversity index of the pollinators across Norway using the results obtained in (b), (c) and (d).
Figure 1: Flowchart of the modelling framework used in estimating the pollinator potential in this project. The modelling start with A) obtaining bees, bumblebees and hoverflies data from GBIF and fitting ISDMs to the pollinator data, then B) obtaining data on some selected plant species from GBIF and fitting ISDMs to the plant data, C) constructing the interraction between the plant and pollinators (from a given data) and D) estimating the alpha diversity indices. The covariate used to fit the ISDMs are presented in Table 7. The dataset and covariates are enclosed in dashed green-colored rectangular regions.

2. About the underlying data

The pollinator (GBIF.Org User 2024a) and plant species (GBIF.Org User 2024b) data used to fit the ISDMs were obtained from GBIF. The data was downloaded via the R-package rgbif (Chamberlain et al. 2024), and extract the associated metadata using the same R-package. We downloaded GBIF data contains \(26\) datasets and \(30\) datasets with records on pollinators and plants respectively. We present a summary of the datasets downloaded for the pollinator and plant species respectively in Table 5 and Table 6 respectively.

From the metadata, we ascertain the type of each dataset: either presence-only, presence-absence or count data. We merge all the presence-only data into one dataset (which we call mergedDatasetPO), but we do not merge the rest of the datasets. This is because the presence-absence data (procesed from sampling event protocols) have different sampling protocols, and we would like to preserve their unique attributes. A summary of the formatted dataset used to fit the ISDMs are presented in Table 2 and Table 3.

Table 2: Number of pollinator occurrences from the formatted datasets obtained from GBIF. All presence-only datasets are merged into one dataset (mergedDataset).
Dataset name No. of bees occurrences No. of butterflies occurrences No. of hoverflies occurrences
mergedDatasetPO 22316 61009 17555
National insect monitoring in Norway 182 157 241
Saproxylic insects caught in window traps and hatched from polypores in small and large old forests in southern Norway 5 5 5
Solitary bees collected in a large-scale field experiment in power line clearings, southeast Norway 391 0 0
Freshwater benthic invertebrates ecological collection NTNU University Museum 3560 3569 5433
Bumble bees collected in a large-scale field experiment in power line clearings, southeast Norway 78 0 0
Bumblebees and butterflies in Norway 5687 5192 0
Freshwater pelagic invertebrates ecological collection NTNU University Museum 1573 1573 1574
Table 3: Number of plant occurrences from the formatted datasets obtained from GBIF. All presence-only datasets are merged into one dataset (mergedDataset).
Species mergedDatasetPO Monitoring data of natural and man-made semi-natural meadows in and around Oslo, Norway 2018-2021 Vegetation data with and without experimental warming, alpine Finse 2000, 2004, 2011 Effects of vegetation clearing on vascular plants in power line clearings southeast Norway Vascular plants in power line clearings and the nearby forest, southeast Norway Overvåking av semi-naturlig eng (ASO)
Ajuga pyramidalis 4003 22 720 1533 599 153
Astragalus alpinus 1851 22 720 1533 599 153
Campanula rotundifolia 11703 22 720 1533 599 153
Carum carvi 2584 22 720 1533 599 153
Filipendula ulmaria 15967 22 720 1533 599 153
Galium album 3624 22 720 1533 599 153
Galium aparine 1722 22 720 1533 599 153
Galium boreale 7360 22 720 1533 599 153
Galium elongatum 1399 22 720 1533 599 153
Galium palustre 2180 22 720 1533 599 153
Galium saxatile 1299 22 720 1533 599 153
Galium uliginosum 1761 22 720 1533 599 153
Galium verum 3276 22 720 1533 599 153
Geranium robertianum 5497 22 720 1533 599 153
Geranium sylvaticum 13861 22 720 1533 599 153
Gymnadenia conopsea 3179 22 720 1533 599 153
Hieracium murorum 20 22 720 1533 599 153
Hieracium umbellatum 3663 22 720 1533 599 153
Hieracium vulgatum 26 22 720 1533 599 153
Hypochaeris radicata 1287 22 720 1533 599 153
Knautia arvensis 6030 22 720 1533 599 153
Lathyrus linifolius 6118 22 720 1533 599 153
Lathyrus pratensis 5458 22 720 1533 599 153
Lathyrus vernus 2013 22 720 1533 599 153
Leucanthemum vulgare 9160 22 720 1533 599 153
Lotus corniculatus 11190 22 720 1533 599 153
Nardus stricta 5966 22 720 1533 599 153
Origanum vulgare 2178 22 720 1533 599 153
Plantago lanceolata 5091 22 720 1533 599 153
Plantago major 6525 22 720 1533 599 153
Plantago media 2045 22 720 1533 599 153
Silene dioica 6888 22 720 1533 599 153
Silene vulgaris 3343 22 720 1533 599 153
Stellaria graminea 7784 22 720 1533 599 153
Stellaria media 3723 22 720 1533 599 153
Stellaria nemorum 3694 22 720 1533 599 153
Trifolium medium 4394 22 720 1533 599 153
Trifolium pratense 11519 22 720 1533 599 153
Trifolium repens 10479 22 720 1533 599 153
Trollius europaeus 2856 22 720 1533 599 153
Valeriana sambucifolia 106 22 720 1533 599 153
Veronica alpina 403 22 720 1533 599 153
Veronica chamaedrys 8302 22 720 1533 599 153
Veronica officinalis 10738 22 720 1533 599 153
Veronica serpyllifolia 711 22 720 1533 599 153
Vicia cracca 9257 22 720 1533 599 153
Vicia sepium 5705 22 720 1533 599 153
Vicia sylvatica 1969 22 720 1533 599 153
Viola biflora 2990 22 720 1533 599 153
Viola canina 3163 22 720 1533 599 153
Viola epipsila 423 22 720 1533 599 153
Viola palustris 7853 22 720 1533 599 153
Viola riviniana 8650 22 720 1533 599 153
Viola tricolor 3667 22 720 1533 599 153

2.1 Spatial and temporal resolution

Both pollinator and plant data obtained on a National scale (entire Norway) observed within \(2010\) to \(2024\). We present an illustration of the spatial distribution of the pollinator in the merged Presence-only and National insect monitoring in Norway datasets across the study region in Figure 2.

Figure 2: Distribution of bees, butterflies and hoverfies in the merged Presence-only and National insect monitoring in Norway datasets. The merged Presence-only dataset is colored by the taxon, and the National insect monitoring dataset is colored by the presence (1) or absence (0) value.

Iintend to keep either the table or the figures in the final document, depending on what is needed for the report.

2.2 Original units

The original unit of the each dataset obtained from GBIF for the pollinators and plants are provided in Table 5 and Table 6 respectively. The dataset are either presence-only and presence-absence.

2.3 Additional comments about the dataset

3. Indicator properties

Anders will help with this

3.1. ECT

3.2. Ecosystem condition characteristic

3.3. Other standards

3.4. Collinearities with other indicators

4. Reference condition and values

4. 1. Reference condition

4. 2. Reference values

4.2.1 Minimum and maximum values

Reference values for diversity indices at the ASO data meadows.

Shannon diversity indices at ASO data locations.

4.2.2. Threshold value for defining good ecological condition (if relevant)

4.2.3. Spatial resolution and validity

5. Uncertainties

6. References

7. Datasets

Here, we describe each of the dataset used for the modelling. We refer the readers to the dataset description provided on the GBIF website.

Table 4: Datasets obtained on selected plant species from GBIF, the type of dataset, total number of occurrence records in each dataset and its percentage in the total number of occurrence records. The dataset with type sampling_event is treated as presence-absence, occurrence as presence-only and insectMonitoring as count data.
Dataset Link to description
Bumble bees collected in a large-scale field experiment in power line clearings, southeast Norway https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee
Bumblebees and butterflies in Norway https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee
Ecofact https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee
Entomological collections, UiB [link](https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee)
Entomology collection, UiT Tromsø Museum NA
Entomology, Oslo (O) UiO NA
Freshwater benthic invertebrates ecological collection NTNU University Museum NA
Freshwater pelagic invertebrates ecological collection NTNU University Museum NA
Insect collection, UiT, University Museum (TSZ). Insect labeling project and PhD-duty-work NA
Insects of the Forest-Tundra Ecotone (ForTunE) NA
Invertebrate collections, UiB NA
Jordal NA
Lepidoptera collection, South Norway NA
Mapping and monitoring of the Glanville fritillary Melitaea cinxia NA
NINA insect database NA
NORSC - Sciaroidea, UiT Tromsø Museum NA
National insect monitoring in Norway https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee
Norwegian Biodiversity Information Centre - Other datasets NA
Norwegian Species Observation Service NA
Observation.org, Nature data from around the World NA
Occurrence data from various smaller projects in Norway NA
Saproxylic insects caught in window traps and hatched from polypores in small and large old forests in southern Norway https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee
Solitary bees collected in a large-scale field experiment in power line clearings, southeast Norway NA
Terrestrial and limnic invertebrates systematic collection, NTNU University Museum NA
iNaturalist Research-grade Observations NA
naturgucker NA

Here, intend to link all the datasets to their correct version on the GBIF repository.

7.1 Pollinator datasets

Table 5: Datasets obtained on pollinators from GBIF, the type of dataset, total number of occurrence records in each dataset and its percentage in the total number of occurrence records. The dataset with type sampling_event is treated as presence-absence, occurrence as presence-only and insectMonitoring as count data.
Dataset name Data type No. of observations Percent
Bumble bees collected in a large-scale field experiment in power line clearings, southeast Norway SAMPLING_EVENT 5016 0.28
Bumblebees and butterflies in Norway SAMPLING_EVENT 31112 1.75
Ecofact OCCURRENCE 2 0.00
Entomological collections, UiB OCCURRENCE 2757 0.16
Entomology collection, UiT Tromsø Museum OCCURRENCE 2 0.00
Entomology, Oslo (O) UiO OCCURRENCE 63124 3.56
Freshwater benthic invertebrates ecological collection NTNU University Museum SAMPLING_EVENT 11559 0.65
Freshwater pelagic invertebrates ecological collection NTNU University Museum SAMPLING_EVENT 17 0.00
Insect collection, UiT, University Museum (TSZ). Insect labeling project and PhD-duty-work OCCURRENCE 2048 0.12
Insects of the Forest-Tundra Ecotone (ForTunE) OCCURRENCE 216 0.01
Invertebrate collections, UiB OCCURRENCE 14 0.00
Jordal OCCURRENCE 48 0.00
Lepidoptera collection, South Norway OCCURRENCE 613 0.03
Mapping and monitoring of the Glanville fritillary Melitaea cinxia OCCURRENCE 785 0.04
NINA insect database OCCURRENCE 21342 1.20
NORSC - Sciaroidea, UiT Tromsø Museum OCCURRENCE 6886 0.39
National insect monitoring in Norway insectMonitoring 610146 34.41
Norwegian Biodiversity Information Centre - Other datasets OCCURRENCE 21 0.00
Norwegian Species Observation Service OCCURRENCE 971031 54.75
Observation.org, Nature data from around the World OCCURRENCE 4636 0.26
Occurrence data from various smaller projects in Norway OCCURRENCE 7 0.00
Saproxylic insects caught in window traps and hatched from polypores in small and large old forests in southern Norway SAMPLING_EVENT 122 0.01
Solitary bees collected in a large-scale field experiment in power line clearings, southeast Norway SAMPLING_EVENT 2699 0.15
Terrestrial and limnic invertebrates systematic collection, NTNU University Museum OCCURRENCE 32654 1.84
iNaturalist Research-grade Observations OCCURRENCE 6456 0.36
naturgucker OCCURRENCE 99 0.01

Out of the 26 datasets obtained from GBIF, over half of the total occurrence reports for the pollinators were from the Norwegian species Observation Service, a third of the total occurrence report were from the National insect monitoring in Norway, and the other 24 datasets cover 22 percent of the total pollinator occurrences.

Figure 3: Distribution of bees, butterflies and hoverfies caight in window traps and those collected in large-scale field experiment in power line clearings. The observations are colored by the presence (1) or absence (0) value.

After processing the data, it is important that we filter only the bees from the following datasets: Solitary bees collected in a large-scale field experiment in power line clearings (southeast Norway), Bumble bees collected in a large-scale field experiment in power line clearings (southeast Norway), as can be seen in Figure 3 and Figure 4.

Figure 4: Distribution of bees, butterflies and hoverfies NTNU university Museum collection and large scale field experiment in Norway. The observations are colored by the presence (1) or absence (0) value.

We also filter out bees and butterflies datasets from the Bumblebees and butterflies in Norway dataset, displayed in Figure 5.

Figure 5: Distribution of bees, butterflies and hoverfies in the Bumblebees and butterflies in Norway and freshwater pelagic invertebrates ecological collection from the NTNU University Museum datasets. The observations are colored by the presence (1) or absence (0) value.

7.2. Plant datasets

From our query for datasets from GBIF, we got a total of 30 datasets which we used to fit the ISDM for the plants. Out of these occurrence records, about 86 percent of these records were from the Norwegian species Observation Service.

Table 6: Datasets obtained on selected plant species from GBIF, the type of dataset, total number of occurrence records in each dataset and its percentage in the total number of occurrence records. The dataset with type sampling_event is treated as presence-absence, occurrence as presence-only and insectMonitoring as count data.
Dataset name Data type No. of observations Percent
ARKO strandeng OCCURRENCE 274 0.09
Ecofact OCCURRENCE 51 0.02
Effects of vegetation clearing on vascular plants in power line clearings southeast Norway SAMPLING_EVENT 779 0.24
Jordal OCCURRENCE 5434 1.70
Monitoring data of natural and man-made semi-natural meadows in and around Oslo, Norway 2018-2021 SAMPLING_EVENT 2310 0.72
Norwegian Species Observation Service OCCURRENCE 288837 90.51
Observation.org, Nature data from around the World OCCURRENCE 2993 0.94
Occurrence data from various smaller projects in Norway OCCURRENCE 775 0.24
Overvåking av semi-naturlig eng (ASO) SAMPLING_EVENT 1606 0.50
Overvåking av åpen grunnlendt kalkmark i Oslofjordområdet OCCURRENCE 1781 0.56
Pl@ntNet automatically identified occurrences OCCURRENCE 3265 1.02
Pl@ntNet observations OCCURRENCE 443 0.14
Stabbetorp - Floristiske registreringer 2016 OCCURRENCE 1209 0.38
Vascular Plant Herbarium, Oslo (O) UiO OCCURRENCE 1006 0.32
Vascular Plant Herbarium, UiB OCCURRENCE 45 0.01
Vascular Plants, Observations, Oslo (O) OCCURRENCE 192 0.06
Vascular plant herbarium (KMN) UiA OCCURRENCE 299 0.09
Vascular plant herbarium TRH, NTNU University Museum OCCURRENCE 659 0.21
Vascular plant herbarium, The Arctic University Museum of Norway (TROM) OCCURRENCE 208 0.07
Vascular plants in power line clearings and the nearby forest, southeast Norway SAMPLING_EVENT 269 0.08
Vegetation data with and without experimental warming, alpine Finse 2000, 2004, 2011 SAMPLING_EVENT 148 0.05
iNaturalist Research-grade Observations OCCURRENCE 6509 2.04
naturgucker OCCURRENCE 28 0.01
Figure 6: Distribution of plant species in the merged presence-only dataset.
Figure 7: Distribution of plant species in the ASO data. The observations are colored by the presence (1) or absence (0) value.

7.3. Covariates

We used several covariates to fit both the pollinator and plant ISDMs. The names of the covariates, their description and source are presented in Table 7. Additionally, we include the indication of whether the covariate is a habitat, trait or climatic variables as well as which of the ISDMs it was used to fit in Table 7. The rasters of the covariates are on \(1\) km resolution.

Table 7: Description of covariates used to fit the integrated species distribution models and the source the covariates were retrieved from. The table indicated what type of covariate (habitat, trait and climatic) and which of the distributions it was used for (pollinator and/or plant) ISDM
Name Description Source Habitat/ Trait/ Climatic Disribution
bio1 Annual temperature geodata R-package Climatic Pollinator, Plant
cellID 1 km grid cell created from the bio1 raster NA Trait NA
Latitude Latitudinal gradient extracted from the bio1 raster NA Trait Pollinator
mBio1 Mean pollinator annual temperature per each pollinator species NA Trait Pollinator
sdBio1 Standard deviation of pollinator annual temperature per each pollinator species NA Trait Pollinator
mLat Mean pollinator latitude per each pollinator species NA Trait Pollinator
sdLat Standard deviation of pollinator latitude per each pollinator species NA Trait Pollinator
RegCom NA NA Trait Pollinator
landCover Land cover raster NA Habitat Pollinator, Plant
soilPh NA NA Habitat Plant
soilCoarseFraction NA NA Habitat Plant
soilOrganicCarbon NA NA Habitat Plant
soilMoisture NA NA Climatic Plant
bio1sq Square of the annual temperature NA Climatic Pollinator
bmInt Interraction between the annual temperature and mean pollinator annual temperature per each pollinator species NA Climatic Pollinator
bmIntsq square of interraction between the annual temperature and mean pollinator annual temperature per each pollinator species NA Climatic Polinator

We present the distribution of the covariates described in Table 7 here in Figure 8.

Figure 8: Covariates used to fit the ISDM. The resolution of the covariates are on a 1 km resolution.

8. Spatial units

9. Analyses

We fitted an ISDM using the point process framework [REF] to the merged pollinator presence-only (PO) and presence-absence (PA) datasets. ISDMs are various observation models that are linked together by a shared ecological process.

9.1. General overview of ISDMs

In this subsection, we present an overview of the ISDMs for general understanding of this document. For further details on ISDM, refer to Isaac et al. (2020), Dorazio (2014) and Fithian et al. (2015). The models are described using the point process framework as described in Adjei et al. (2023). Additionally, we present the description of these models under the assumption that we are fitting a multi-species ISDM.

We model the presence-only data as a Poisson point process model (Illian et al. 2008) with mean intensity \(\lambda_i(s)\) for each species \(i = 1, \ldots, S\), where \(S\) is the number of species.This mean intensity modelled as: \[\begin{equation} \label{eq:int} \ln(\lambda_i(s)) = \beta_{0i} + \beta_{ji} X_j(s) + \omega_i(s) + \eta(s), \end{equation}\] where \(\beta_{0i}\) is the species-specific intercept, \(\beta_{ji}\) is the species-specific effect of covariate \(j\), \(X_j(s)\) is the \(j^{th}\) covariate field, \(\omega_i(s)\) is the species-specific spatial autocorrelation field and \(\eta(s)\) is the bias field. \(\omega_i(s)\) and \(\eta(s)\) are modeled as zero mean Gaussian random field (i.e. \(\omega_i(s) \sim N(0, \Sigma)\), where \(\Sigma\) is a Mat’ern covariance matrix with variance \(\sigma_{i}^2\) and range \(\kappa_i\) and \(\eta(s) \sim N(0, \Sigma_\eta)\), where \(\Sigma_\eta\) is a Mat’ern covariance matrix with variance \(\tau^2\) and range \(\kappa_\eta\)).

We model the presence-absence data with a logistic regression model. Let \(y_i(s)\) be a binary value at location \(s\) with \(0\) indicating absence of species \(i\) and \(1\) indicating presence of species \(i\). Then \(y_i(s) \sim \text{Bernoulli}(\Psi_i(s))\) with: \[ cloglog(\Psi_i(s)) = \beta_{0i} + \beta_{ji} X_j(s) + \omega_i(s) \] where the parameters \(\beta_0\), \(\beta_1\) and \(\omega\) are defined in equation \(\ref{eq:int}\).

All the ISDMs were fitted using the R-package PointedSDMs (Mostert and O’Hara 2023).

9.2. ISDM for pollinators

Using the model framework defined above, we fitted the ISDM to the pollinator dataset obtained from GBIF. Due to memory available, we fit the ISDM seperately for each pollinator.

Code
# Load data and covariates
source("pipeline/dataImport/importPollinatorFromGBIF.R")
source("pipeline/dataImport/environmentalDataImport.R")

# Set the model
model <- PointedSDMs::startSpecies(PointsBeesAndButterfliesAndHoverflies, # list of pollinator dataset (containing both mergedDatasetPA and mergedDatasetPO) 
                                   Boundary = regionGeometry, # boundary of the study
                                   Projection = newCrs, # projection
                                   Mesh = meshForProject, #mesh used for the model
                                  speciesEnvironment = TRUE, # Should we have pollinator specific covariate effect
                                  speciesIntercept = TRUE, # TRUE treats the intercept as a random effect, instead of constrained as with a fixed effect 
                                  pointsIntercept = FALSE, #should we include intercept for each dataset
                                  pointCovariates = FALSE, #do we have covariates for the presence-only data
                                   responsePA = 'individualCount', # column name of the response values for presence-absence data
                                  # trialsPA = 'trials',
                                   spatialCovariates = envCovsForModel, # raster of spatia covariates
                                   speciesName = 'Taxon', # Names of the species in the datasets
                                   pointsSpatial = NULL, # Do not include a dataset spatial field
                                   speciesSpatial = "replicate")   # unique spatial field per species, but they share the same hyperparameters.

# Add second bias field for PO data
model$addBias(datasetNames = 'mergedDatasetPO')

9.2.1 Priors

We assume the following priors for the pollinator ISDM:

  • The probability that the standard deviation of the pollinator spatial field is greater than \(1\) is \(0.1\) (i.e. \(P(\sigma_\omega > 0.1) = 0.1\)).

  • The probability that the standard deviation of the bias field is greater than \(1\) is \(0.1\) (i.e. \(P(\sigma_\eta > 0.1) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is less than \(15\)km is \(0.1\) (i.e \(P(\kappa_\omega < 15) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is less than \(15\)km is \(0.1\) (i.e. \(P(\kappa_\eta < 15) = 0.1\))

  • effects of continuous covariates (all covariates except land cover) and the intercept for each pollinator is assumed to be from a Normal distribution with mean \(0\) and variance of \(1\).

Code
# hyper parameters of the spatial field (shared across species)
model$specifySpatial(Species = TRUE,  # define same prior for the all species
                     prior.sigma = c(1, 0.1),  # SD of field variation; 
                     prior.range = c(15, 0.1))  # range of spatial correlation; 


model$specifySpatial(Bias = TRUE, # Change the prior
                                 prior.sigma = c(1, 0.1),
                                 prior.range = c(15, 0.1))

model$specifyRandom(
  # precision parameter on how much each species' spatial field (how much they can deviate from the shared ___)
  speciesGroup = list(model = "iid", 
                      hyper = list(prec = list(prior = "pc.prec",
                                               param = c(0.1, 0.1)))), 
  # precision parameter on the baseline species occurrence rate
  speciesIntercepts = list(prior = 'pc.prec',
                           param = c(0.1, 0.1)))  

# Specify priors for covariate effects (continuous)
covariatesToAddEffects <- c("bio1", "bio.sq", "bmInt", "RegCom", "mBio1", "sdBio1", "sdmInt", "Latitude", "mLat", "sdLat", "mlatInt")
for(covs in covariatesToAddEffects){
  model$priorsFixed(Effect = covs,
                    mean.linear = 0,
                    prec.linear = 1)
}

9.2.2 Fit model and make predictions

We fit the model and make predictions of the pollinator intensity within the study region. We fit the model by using the adaptive strategy of the composite design integration strategy in INLA (Rue, Martino, and Chopin 2009).

Code
modelOptions <- list(num.threads = 4,
                     control.inla = list(int.strategy = 'ccd', 
                                         diagonal = 0.001, 
                                         cmin = 0, 
                                         strategy = "adaptive",
                                         control.vb= list(enable = FALSE)), 
                     verbose = FALSE, 
                     safe = TRUE, 
                     inla.mode = "experimental")

modelRun <- PointedSDMs::fitISDM(data = model, 
                            options = modelOptions)
  
  # predictions for this model
  individualDatasetPreds <- predict(modelRun,
          data = ppxl,
          predictor = TRUE,
          n.samples = 100)

9.2.3 Pollinator ISDM validation

We perform a five-fold cross validation to assess the trait effect on the pollinator distribution across the study region.

9.3. ISDM for plant species

Modelling all the \(54\) species required memoey allocations we did not have, so we split the species into groups with \(10\) species in each group. We fitted independent ISDMs for each of the groups.

Code
# PointedSDMs takes a longer time to fit for many species
# The trick to to break it into smaller groups
# The split will be done by the number of present speccies in each location
# I will split it nGroups time
sortedSpecies <- table(asoDatasf$acceptedScientificName)%>%
  as.data.frame()%>%
  dplyr::arrange(desc(Freq))%>%
  select(Var1)%>%
  c()

# Set the number of groups
nGroups <- 10

#split the species into groups of 10 species in each
plantSpeciesGroup <- split(sortedSpecies$Var1, 
                           seq(1, 
                               ceiling(length(sortedSpecies$Var1)/nGroups)))

Similar to the model the pollinator ISDM, we fitted an ISDM to the \(54\) plant species. The ISDM has species-specific intercept, covariate effect and spatial field (but all the species share the same hyperparameters). We also added a spatial bias field to the observation model for the presence-only dataset.

Code
speciesModelShared <- PointedSDMs::startSpecies(formatPlantData,
                                                Boundary = regionGeometry, 
                                   Projection = newCrs, 
                                   Mesh = meshForProject,
                                   responsePA = 'individualCount',
                                   speciesEnvironment = TRUE,
                                   speciesIntercept = TRUE,
                                   spatialCovariates = envCovsForPlantSpeciesModel, 
                                   speciesName = 'simpleScientificName',
                                   pointsIntercept = FALSE,
                                   pointsSpatial = NULL, # Do not include a dataset spatial field
                                   speciesSpatial = "replicate")  

# Add bias spatial field for the PO dataset
speciesModelShared$addBias(datasetNames = 'mergedPlantsPO')

9.3.1 Priors

We assume the following priors for the pollinator ISDM:

  • The probability that the standard deviation of the pollinator spatial field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\omega > 0.1) = 0.1\)).

  • The probability that the standard deviation of the bias field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\eta > 0.1) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is greater than \(50\) is \(0.1\) (i.e \(P(\kappa_\omega < 50) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is greater than \(50\) is \(0.1\) (i.e. \(P(\kappa_\eta < 50) = 0.1\))

Code
# hyper parameters of the spatial field (shared across species)
speciesModelShared$specifySpatial(Species = TRUE,  # define same prior for the all species
                     prior.sigma = c(1, 0.1),  
                     prior.range = c(5, 0.1))  


speciesModelShared$specifySpatial(Bias = TRUE, # Change the prior
                     prior.sigma = c(0.6, 0.1),
                     prior.range = c(5, 0.1))

# prior for random effects (mesh nodes of spatial field and species intercepts)
speciesModelShared$specifyRandom(
  # precision parameter on how much each species' spatial field (how much they can deviate from the shared ___)
  speciesGroup = list(model = "iid", 
                      hyper = list(prec = list(prior = "pc.prec",
                                               param = c(0.1, 0.1)))),  
  # precision parameter on the baseline species occurrence rate
  speciesIntercepts = list(prior = 'pc.prec',
                           param = c(0.1, 0.1)))  

9.3.2 Model fitting and predictions

We fit the model by using the adaptive strategy of the composite design integration strategy in INLA.

Code
# specify the model options for INLA
modelOptions <- list(num.threads = 4,
                     control.inla = list(int.strategy = 'ccd', 
                                         diagonal = 0.001, 
                                         cmin = 0, 
                                         strategy = "adaptive",
                                         control.vb= list(enable = FALSE)), 
                     verbose = FALSE, 
                     safe = TRUE, 
                     inla.mode = "experimental")

# Species-specific spatial effects model
speciesSharedEst <- PointedSDMs::fitISDM(data = speciesModelShared, 
                            options = modelOptions)

# I proceed with the prediction of occupancy probabilities
# over the entire region
individualDatasetPreds <- predict(speciesSharedEst,
                                  data = ppxl,
                                  predictor = TRUE,
                                  n.samples = 500)

9.4 Species Interractions

Code
  load("../data/webPlot.RData")
  #interractionMatrix <- readr::read_csv(paste0(dataFolder,"/interractionsData/interractionMatrix.csv"))
  
    bipartite::plotweb(WebReady[[1]])

Webplot of plant species and pollinator interractions.

9.5. Diversity Indices

To define the diversity of the pollinators, we estimate the pollinator intensity given their interaction with the plants species within the location \(s\). This is estimated as: \[\begin{equation} \begin{split} \text{Intensity for intensity} &\propto \text{Estimated pollinator intensity} \times \text{Interraction weight} \times \text{plant species occurrence probability}\\ \implies \lambda^{\star}_i(s) &= \frac{\lambda_i(s) \times w_{ik} \times \Psi_k(s)}{\sum_k \lambda_i(s) \times w_k \times \Psi_k(s)}, \end{split} \end{equation}\] where \(w_k\) is the weight of the interaction between pollinator \(i\) and plant species \(k\), \(\Psi_k(s)\) is the occurrence probability of plant species \(k\) and \(\lambda_i(s)\) is the intensity of pollinator \(i\).

We estimated the Hill’s diversity indices for the pollinators. The Hill’s diversity indices are defined as: \[ H(s) = r_i(s) \times log(r_i(s)); \] where \(r_i(s) = \frac{\lambda^{\star}_i(s)}{\sum_{i} \lambda^{\star}_i(s)}\).

10. Results

10.1. Distribution of pollinators accross Norway

We present the predictions of log-intensity of the pollinators in Figure 9 and its standard error in Figure 10. The log-intensity for the three pollinators: bees, butterflies and hoverflies are similar with standard error close to \(0\). This result can be seen as a direct consequence of the estimates of climatic and trait variables on the pollinator distribution (Figure 11 and Figure 12).

Figure 9: Log intensity of pollinators (bees, butterfliesand hoverflies) across the study region from the fitted integrated species distribution model.
Figure 10: Standard error of the log-intensity of pollinators (bees, butterfliesand hoverflies) across the study region from the fitted integrated species distribution model.

The effects of traits and climatic variables on the pollinator distribution in Figure 11. There is a negative latitudinal effect on bees and butterflies, but a positive effect on hoverflies. However, there is a positive effect of annual temperature on butterflies and bees, but a negative effect on hoverflies. There is an insignificant trait effects on the hoverflies distribution as their credible interval of these covariates includes \(0\). The contrast of this statement is true for the bees and butterflies distribution.

Figure 11: Effects of traits and climatic variables on the pollinator distribution.

In contrast to the trait and climatic variables, there seems to be … (Figure 12).

Figure 12: Plant species land cover effects.
Figure 13: Plant species Predictions.
Figure 14: Diversity of pollinators across Norway.

11. Export file

12. References

Adjei, Peprah Kwaku, Philip Mostert, Jorge Sicacha Parada, Emma Skarstein, and Robert B O’Hara. 2023. “The Point Process Framework for Integrated Modelling of Biodiversity Data.” arXiv e-Prints, arXiv–2311.
Chamberlain, Scott, Vijay Barve, Dan Mcglinn, Damiano Oldoni, Peter Desmet, Laurens Geffert, and Karthik Ram. 2024. Rgbif: Interface to the Global Biodiversity Information Facility API. https://CRAN.R-project.org/package=rgbif.
Dorazio, Robert M. 2014. “Accounting for Imperfect Detection and Survey Bias in Statistical Analysis of Presence-Only Data.” Global Ecology and Biogeography 23 (12): 1472–84. https://doi.org/10.1111/geb.12216.
Fithian, William, Jane Elith, Trevor Hastie, and David A. Keith. 2015. “Bias Correction in Species Distribution Models: Pooling Survey and Collection Data for Multiple Species.” Methods in Ecology and Evolution 6 (4): 424–38. https://doi.org/10.1111/2041-210X.12242.
GBIF.Org User. 2024a. “Occurrence Download.” The Global Biodiversity Information Facility. https://doi.org/10.15468/DL.ZSEX2M.
———. 2024b. “Occurrence Download.” The Global Biodiversity Information Facility. https://doi.org/10.15468/DL.AWRNR2.
Illian, Janine, Antti Penttinen, Helga Stoyan, and Dietrich Stoyan. 2008. Statistical Analysis and Modelling of Spatial Point Patterns. John Wiley & Sons.
Isaac, Nick JB, Marta A Jarzyna, Petr Keil, Lea I Dambly, Philipp H Boersch-Supan, Ella Browning, Stephen N Freeman, et al. 2020. “Data Integration for Large-Scale Models of Species Distributions.” Trends in Ecology & Evolution 35 (1): 56–67. https://doi.org/10.1016/j.tree.2019.08.006.
Mostert, Philip S, and Robert B O’Hara. 2023. “PointedSDMs: An r Package to Help Facilitate the Construction of Integrated Species Distribution Models.” Methods in Ecology and Evolution 14 (5): 1200–1207.
Rue, Håvard, Sara Martino, and Nicholas Chopin. 2009. “Approximate Bayesian Inference for Latent Gaussian Models Using Integrated Nested Laplace Approximations (with Discussion).” Journal of the Royal Statistical Society B 71: 319–92.